ABSTRACT

The usual Bayes-Stein shrinkages of maximum likelihood estimates towards
a common value may be refined by taking fuller account of the locations of the
individual observations. Under a Bayesian formulation, the types of
shrinkages depend critically upon the nature of the common distribution
assumed for the parameters at the second stage of the prior model. In the
present paper this distribution is estimated empirically from the data,
permitting the data to determine the nature of the shrinkages. For example,
when the observations are located in two or more clearly distinct groups, the
maximum likelihood estimates are, roughly speaking, constrained towards common
values within each group. The method also detects outliers; an extreme
observation will either be regarded as an outlier and not substantially
adjusted towards the other observations, or it will be rejected as an outlier,
in which case a more radical adjustment takes place. The method is
appropriate for a wide range of sampling distributions and may also be viewed
as an alternative to standard multiple comparisons, cluster analysis, and
nonparametric kernel methods.
SIGNIFICANCE AND EXPLANATION

The  shrinkage  properties  of  Bayes-Stein  estimators  depend  heavily  on  the 
particular  choices  of  tail  behaviour  and  modality  for  the  mixing  distribution 
in  the  exchangeable  prior.  In  this  paper  the  mixing  distribution  is  therefore 
estimated empirically and nonparametrically from the data, rather than being
constrained  by  an  a  priori  choice  of  its  functional  form.  It  is  estimated  via 
a  modified  maximum  likelihood  procedure  as  a  discrete  distribution.  The 
consequent  posterior  estimates  place  considerable  emphasis  upon  the  scatter  of 
the  data  and  possess  rather  different  properties  from  standard  Bayes-Stein 
techniques  which  shrink  all  the  observations  towards  the  same  common  value. 

In  two  numerical  examples  the  method  proves  useful  both  for  detecting  outliers 
and  for  indicating  whether  the  data  should  be  divided  into  two  or  more  groups. 
The responsibility for the wording and views expressed in this descriptive
summary  lies  with  MRC,  and  not  with  the  author  of  this  report. 


SOME  DATA-ANALYTIC  MODIFICATIONS  TO  BAYES-STEIN  ESTIMATION 


Tom  Leonard 

1. SIMULTANEOUS ESTIMATION

Consider observations $x_1,\ldots,x_m$ which are independent, given respective parameters
$\theta_1,\ldots,\theta_m$, and where $x_i$ possesses density, or probability mass function,
$f_i(x_i|\theta_i)$ for $x_i \in \mathcal{X}$ and $\theta_i \in \Theta$, for $i = 1,\ldots,m$. Suppose further that
the $\theta_i$ are a priori exchangeable and that they possess the prior probability structure of
a random sample from a distribution with density $g(\theta)$.

Most Bayesian simultaneous estimation methods (e.g. Leonard, 1972, Lindley and Smith,
1972, and Clevenson and Zidek, 1975, for binomial, normal, and Poisson situations) take the
density $g$ to belong to a parametrized family, and then introduce second stage
distributional assumptions about the parameters of $g$. The choice of $g$ very often
involves a unimodal density with thin tails, e.g. normal or Gamma. These choices typically
lead to posterior estimates of the $\theta_i$ which shrink the $x_i$ towards a common value (e.g.
zero, the prior mean, or the average observation), thus providing Bayesian analogues of
frequentist procedures (e.g. James and Stein, 1961, and Efron and Morris, 1973a).

Whilst  the  previous  choices  of  prior  will  be  adequate  in  numerous  situations, 
shrinkages  towards  a  common  value  may  be  less  appropriate  in  cases  where  g  does  not 
assume  such  an  idealized  form.  For  example,  Dawid  (1973)  and  Leonard  (1974)  investigate 
prior  densities  with  thicker  tails  than  the  normal  and  show  that  it  is  then  unreasonable  to 
shrink  in  extreme  observations  as  radically  as  suggested  by  an  analysis  based  upon  a  normal 
prior. Alternatively, $g$ might possess more than one mode, in which case fairly complex
shrinkages  might  be  involved. 

In  the  present  paper  we  relax  previous  assumptions  involving  thin-tailed  unimodal 
densities  and  indeed  proceed  to  the  other  extreme  by  supposing  that  the  statistician 
possesses  absolutely  no  prior  information  about  the  density  g.  Our  motivation  is  to 
investigate  the  shrinkages  which  are  actually  suggested  by  the  data,  rather  than  imposed  by 
particular functional forms assumed for $g$. If there were some partial information about
g  then  this  could  be  introduced  via  the  method  proposed  by  Leonard  (1978)  for  smoothing 
densities;  this  aspect  will  not  however  be  considered  in  this  paper. 

We will explore the consequences of estimating $g$ empirically from the data. Readily
computable estimates will be obtained which avoid problems of specifying the tail
behaviour, modality, and general shape of $g$.

For different reasons, Laird (1978) investigates the theoretical properties of the
maximum likelihood estimate of $g$, obtained by maximizing the log-likelihood functional

$$L(g) = \sum_{i=1}^{m} \log \int_{\Theta} f_i(x_i|\theta)\, g(\theta)\, d\theta \qquad (1.1)$$

She shows that the maximum likelihood estimate of $g$ is, under certain regularity
conditions, a mixture of Dirac-delta functions; a fairly complex scheme based upon the EM
algorithm is proposed for evaluating the optimum.

In the next section we employ a mathematical device leading to a simpler estimation
scheme  for  g;  this  leads  to  a  solution  maximizing  the  likelihood  functional  amongst  a 
particular  restricted  class  of  estimates.  Other  relevant  references  from  the  literature  in 
the  general  empirical  Bayes  area  are  well  catalogued  by  Laird. 

2. THE EMPIRICAL ESTIMATION OF THE PRIOR DENSITY

Consider the limiting situation where the sampling variation in each of the
$f_i(x_i|\theta_i)$ distributions approaches zero, so that the $\theta_i$ become effectively known and
equal to their maximum likelihood estimates $\hat{\theta}_i$. In this limiting case the maximum
likelihood estimate of $g(\theta)$ is

$$g(\theta) = m^{-1} \sum_{i=1}^{m} \delta_{\hat{\theta}_i}(\theta) \qquad (\theta \in \Theta) \qquad (2.1)$$

where $\delta_{\hat{\theta}_i}(\theta)$ denotes the Dirac-delta function at $\theta = \hat{\theta}_i$. This motivates us to
consider, in general, estimates for $g$ which take the form

$$g(\theta) = m^{-1} \sum_{i=1}^{m} \delta_{a_i}(\theta) \qquad (\theta \in \Theta) \qquad (2.2)$$

but where $a_1,\ldots,a_m$ are now arbitrary points to be estimated from the data. We
anticipate that, when the first-stage sampling variation is reintroduced, this will cause
the $a_i$ to adjust the $\hat{\theta}_i$ by reducing their overall spread, and hence cause a sort of
Stein-effect on the $\hat{\theta}_i$.

Substituting the function in (2.2) for $g$ in (1.1) provides us with the log-likelihood
of $a_1,\ldots,a_m$, which is given by

$$L(a) = \sum_{i=1}^{m} \log \sum_{k=1}^{m} f_i(x_i|a_k) - m \log m \qquad (2.3)$$

The $a_k$ will be estimated by maximizing the function in (2.3). The optimizing values
could be interpreted as hypothetical observations from the distribution $g$, roughly
speaking equal in information content about $g$ to the information about $g$ contained in
the log-likelihood functional (1.1).
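As a minimal sketch of how (2.3) might be evaluated numerically, the following assumes a common Poisson sampling density for every $x_i$ (the text allows a different $f_i$ for each observation). The names `poisson_logpmf` and `log_likelihood` are illustrative and do not come from the paper.

```python
import math

def poisson_logpmf(x, theta):
    # Log of the Poisson mass function with mean theta > 0.
    return x * math.log(theta) - theta - math.lgamma(x + 1)

def log_likelihood(xs, a, logpdf=poisson_logpmf):
    # L(a) = sum_i log sum_k f(x_i | a_k) - m log m, as in (2.3).
    m = len(xs)
    total = 0.0
    for x in xs:
        terms = [logpdf(x, ak) for ak in a]
        top = max(terms)  # log-sum-exp trick for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total - m * math.log(m)

xs = [1, 4, 2]
value = log_likelihood(xs, [1.0, 4.0, 2.0])
```

When all the $a_k$ coincide at a single value, the expression reduces to the ordinary log-likelihood of a common parameter, which gives a quick sanity check on any implementation.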

Note that in all the numerical examples we have considered, the optimal values for
$a_1,\ldots,a_m$ will become concentrated at a smaller number of estimated points, say
$b_1,\ldots,b_p$. The prior probability attached to point $b_j$ should then be estimated by

$$g(b_j) = \#\{k : a_k = b_j\}/m \qquad (j = 1,\ldots,p) \qquad (2.4)$$

This yields a discrete distribution which assigns estimated probabilities to $p$
estimated points, where $p$ is also obtained empirically. We anticipate that it will be
close in numerical terms to the unrestricted maximum likelihood estimate proposed by Laird.
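The counting step in (2.4) can be sketched as follows. Treating two fitted values as the same point when they agree to within a tolerance `tol` is an assumption of this sketch, since numerically converged values are rarely exactly equal; the function name `collapse` is hypothetical.

```python
def collapse(a, tol=1e-6):
    # Group the fitted a_k into distinct points b_j and attach to each
    # the proportion of a_k values falling at that point, as in (2.4).
    points = []  # each entry is [b_j, count]
    for value in sorted(a):
        if points and abs(value - points[-1][0]) <= tol:
            points[-1][1] += 1
        else:
            points.append([value, 1])
    m = len(a)
    return [(b, count / m) for b, count in points]

# Hypothetical fitted values concentrated at two points
prior = collapse([0.44, 0.153, 0.44, 0.153, 0.153])
```

The number of returned pairs plays the role of $p$, which is therefore determined by the data rather than fixed in advance.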

Differentiating the function in (2.3) with respect to $a_k$ gives us, after some
rearrangement,

$$\frac{\partial L(a)}{\partial a_k} = \sum_{i=1}^{m} P_{ik}\, \frac{\partial \log f_i(x_i|a_k)}{\partial a_k} \qquad (k = 1,\ldots,m) \qquad (2.5)$$

where

$$P_{ik} = A_{ik} \Big/ \sum_{l=1}^{m} A_{il} \qquad (2.6)$$

with

$$A_{ik} = f_i(x_i|a_k) \qquad (2.7)$$

Note that, when $a_1,\ldots,a_m$ are unequal, the expression in (2.6) is just the posterior
probability that $\theta_i = a_k$, under the prior distribution in (2.2). Therefore, solving the
maximum likelihood equations for the $a_k$ also gives us empirical estimates for the entire
posterior distribution for each $\theta_i$, for $i = 1,\ldots,m$, so that posterior estimates may
also be obtained for the $\theta_i$.

Equating the derivatives of the log-likelihood to zero yields a set of equations which may
in general be solved by any standard iterative procedure, e.g. Newton-Raphson. However, the
computations turn out to be particularly simple in a variety of special cases.

(a) Exponential family of sampling distributions

When the sampling densities $f_i$ assume the forms

$$f_i(x_i|\theta_i) = \exp\{B(\theta_i) + t(x_i)C(\theta_i) + D(x_i)\} \qquad (2.8)$$

for appropriate choices of the functions $B$, $C$, $D$, and $t$, then the maximum likelihood
equations for the $a_k$ are

$$-\frac{B'(a_k)}{C'(a_k)} = \sum_{i=1}^{m} P_{ik}\, t(x_i) \Big/ \sum_{i=1}^{m} P_{ik} \qquad (k = 1,\ldots,m) \qquad (2.9)$$

where the $P_{ik}$ are defined in (2.6). Equations (2.9) may be solved by substituting trial
values (initially the values $\hat{\theta}_k$) for the $a_k$ in the right hand sides, transforming the
left hand sides into fresh values for the $a_k$, and then cycling until convergence. For
example, when the $x_i$ possess Poisson distributions with respective means $\theta_i$, we have

$$a_k = \sum_{i=1}^{m} P_{ik}\, x_i \Big/ \sum_{i=1}^{m} P_{ik} \qquad (k = 1,\ldots,m) \qquad (2.10)$$

clearly demonstrating that each $a_k$ takes the form of a weighted average of $x_1,\ldots,x_m$,
so that the overall spread of the $a_k$ will be less than that of the $x_i$.
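The cyclic substitution described above can be sketched for the Poisson case, where each $a_k$ is refreshed as a weighted average of the observations with weights given by the posterior probabilities $P_{ik}$. The function names, the stopping rule, and the small `floor` that keeps $\log a_k$ finite when an observation is zero are assumptions of this sketch rather than details given in the paper.

```python
import math

def poisson_weights(xs, a):
    # P[i][k]: posterior probability that theta_i equals a_k.
    P = []
    for x in xs:
        # log f(x | a_k) up to log(x!), which cancels within each row
        logs = [x * math.log(ak) - ak for ak in a]
        top = max(logs)
        w = [math.exp(v - top) for v in logs]
        total = sum(w)
        P.append([v / total for v in w])
    return P

def poisson_prior_points(xs, max_iter=1000, tol=1e-10, floor=1e-8):
    # Start at the maximum likelihood estimates and cycle until the
    # largest change in any a_k falls below tol.
    m = len(xs)
    a = [max(float(x), floor) for x in xs]
    for _ in range(max_iter):
        P = poisson_weights(xs, a)
        new = []
        for k in range(m):
            num = sum(P[i][k] * xs[i] for i in range(m))
            den = sum(P[i][k] for i in range(m))
            new.append(max(num / den, floor))
        change = max(abs(u - v) for u, v in zip(new, a))
        a = new
        if change < tol:
            break
    return a, poisson_weights(xs, a)

# Hypothetical counts forming two loose groups
a, P = poisson_prior_points([2, 3, 4, 20, 22])
```

Because every update is a weighted average of the observations, the fitted points can never lie outside the range of the data, which is the compression of spread noted in the text.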

(b) Binomial distributions with unequal sample size

If the $x_i$ are independent and possess binomial distributions, given the
corresponding probabilities $\theta_i$ and sample sizes $n_i$, then the maximum likelihood
equations for the $a_k$ are given by

$$a_k = \sum_{i=1}^{m} P_{ik}\, x_i \Big/ \sum_{i=1}^{m} P_{ik}\, n_i \qquad (k = 1,\ldots,m) \qquad (2.11)$$

where we may take the $A_{ik}$ in the expression for $P_{ik}$ in (2.6) to satisfy

$$A_{ik} \propto a_k^{x_i} (1 - a_k)^{n_i - x_i} \qquad (2.12)$$

since the functional contributions to the sampling distribution cancel themselves out.
Note that $-2 \log A_{ik}$ takes the form of a distance measure between $x_i/n_i$ and $a_k$. Hence
$a_k$ in (2.11) will depend more heavily upon those $x_i/n_i$ nearby than on outlying $x_i/n_i$.
This creates a mechanism enabling $a_1,\ldots,a_m$ to take full account of the random
variability in the $x_i$.
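A rough sketch of the binomial iteration, applied to the female counts and course totals of Table 1, is given below. Each $a_k$ is updated as $\sum_i P_{ik} x_i / \sum_i P_{ik} n_i$ with weights proportional to $a_k^{x_i}(1-a_k)^{n_i-x_i}$. The function names, the clipping constant `eps`, and the stopping rule are assumptions of this sketch, and no claim is made that it reproduces the figures reported in Section 4 to the digit.

```python
import math

def binomial_weights(x, n, a):
    # P[i][k] proportional to a_k^x_i (1 - a_k)^(n_i - x_i);
    # the binomial coefficient cancels within each row.
    P = []
    for xi, ni in zip(x, n):
        logs = [xi * math.log(ak) + (ni - xi) * math.log(1.0 - ak) for ak in a]
        top = max(logs)
        w = [math.exp(v - top) for v in logs]
        total = sum(w)
        P.append([v / total for v in w])
    return P

def binomial_prior_points(x, n, max_iter=5000, tol=1e-10, eps=1e-6):
    m = len(x)

    def clip(p):
        # keep every a_k strictly inside (0, 1) so the logs stay finite
        return min(max(p, eps), 1.0 - eps)

    a = [clip(xi / ni) for xi, ni in zip(x, n)]  # start at x_i / n_i
    for _ in range(max_iter):
        P = binomial_weights(x, n, a)
        new = [clip(sum(P[i][k] * x[i] for i in range(m)) /
                    sum(P[i][k] * n[i] for i in range(m)))
               for k in range(m)]
        change = max(abs(u - v) for u, v in zip(new, a))
        a = new
        if change < tol:
            break
    return a, binomial_weights(x, n, a)

# Counts from Table 1: females, and course totals (females + males)
females = [42, 32, 45, 10, 7, 3, 3, 5, 12, 11]
totals = [89, 72, 102, 26, 27, 15, 16, 27, 84, 95]
a, P = binomial_prior_points(females, totals)
```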

(c)  Normal  Observations  with  Unknown  Variance 

Suppose now that for $i = 1,\ldots,m$ and $j = 1,\ldots,n_i$, the observations $x_{ij}$ are
independent and normally distributed with respective group means $\theta_i$ and common variance
$\sigma^2$. Then $\sigma^2$ may be estimated jointly with the prior values $a_k$ by solving the joint
maximum likelihood equations

$$a_k = \sum_{i=1}^{m} P_{ik}\, n_i \bar{x}_i \Big/ \sum_{i=1}^{m} P_{ik}\, n_i, \qquad \sigma^2 = N^{-1} \Big\{ \sum_{i=1}^{m} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{m} \sum_{k=1}^{m} P_{ik}\, n_i (\bar{x}_i - a_k)^2 \Big\} \qquad (2.13)$$

where

$$\bar{x}_i = n_i^{-1} \sum_{j=1}^{n_i} x_{ij}, \qquad N = \sum_{i=1}^{m} n_i,$$

and the $P_{ik}$ are defined in (2.6), with

$$A_{ik} = \exp\{-\tfrac{1}{2} n_i \sigma^{-2} (\bar{x}_i - a_k)^2\} \qquad (2.14)$$

Equations (2.13) may be solved by combining the iterations recommended in
(a), for fixed $\sigma^2$, with simple cyclic substitutions on $\sigma^2$.
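One way to combine the two updates is sketched below: within each sweep the weights $P_{ik}$ are computed from (2.14), the $a_k$ are refreshed as weighted averages of the group means, and $\sigma^2$ is then refreshed from the within-group and between-point sums of squares. The update formulas follow from differentiating the log-likelihood under the stated normal model; the function name, the starting value for $\sigma^2$, and the stopping rule are assumptions of this sketch.

```python
import math

def normal_prior_points(groups, max_iter=1000, tol=1e-10):
    m = len(groups)
    n = [len(g) for g in groups]
    N = sum(n)
    xbar = [sum(g) / len(g) for g in groups]
    within = sum((v - xb) ** 2 for g, xb in zip(groups, xbar) for v in g)
    a = list(xbar)                               # start at the group means
    sigma2 = within / N if within > 0 else 1.0   # starting value (assumption)
    for _ in range(max_iter):
        # weights proportional to exp{-n_i (xbar_i - a_k)^2 / (2 sigma2)}
        P = []
        for i in range(m):
            logs = [-0.5 * n[i] * (xbar[i] - ak) ** 2 / sigma2 for ak in a]
            top = max(logs)
            w = [math.exp(v - top) for v in logs]
            total = sum(w)
            P.append([v / total for v in w])
        new_a = [sum(P[i][k] * n[i] * xbar[i] for i in range(m)) /
                 sum(P[i][k] * n[i] for i in range(m))
                 for k in range(m)]
        between = sum(P[i][k] * n[i] * (xbar[i] - new_a[k]) ** 2
                      for i in range(m) for k in range(m))
        new_sigma2 = (within + between) / N
        change = max(max(abs(u - v) for u, v in zip(new_a, a)),
                     abs(new_sigma2 - sigma2))
        a, sigma2 = new_a, new_sigma2
        if change < tol:
            break
    return a, sigma2

# Hypothetical replicated observations in four groups
groups = [[1.0, 1.2, 0.8], [1.1, 0.9, 1.3], [5.0, 5.2, 4.9], [5.1, 4.8, 5.3]]
a, sigma2 = normal_prior_points(groups)
```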

The  above  procedure  may  be  employed  in  either  the  Model  I  or  Model  II  ANOVA  situations 
since  our  assumptions  relate  either  to  an  exchangeability  model  for  fixed  effects,  or  a 
random  effects  model.  Note  that  the  classical  F-test  for  equality  of  the  means  may  be 
replaced by an inspection as to whether or not all the estimated $a_k$ are equal; t-tests
for  individual  differences  may  be  avoided  by  comparing  the  posterior  means  discussed  in  the 
next  section. 


3.  POSTERIOR  ESTIMATION  OF  THE  SAMPLING  PARAMETERS 

Once the iterations have been completed for the $a_k$ and $P_{ik}$, the parameters
$\theta_1,\ldots,\theta_m$ may be estimated, e.g. by their empirical posterior means

$$\theta_i^* = \sum_{k=1}^{m} P_{ik}\, a_k \qquad (i = 1,\ldots,m) \qquad (3.1)$$

For example, in the normal situation (2.13) we have

$$\theta_i^* = \sum_{l=1}^{m} n_l \bar{x}_l \Big( \sum_{k=1}^{m} \frac{P_{ik} P_{lk}}{\sum_{j=1}^{m} n_j P_{jk}} \Big) \qquad (3.2)$$

which can be arranged in the form of a weighted average of $\bar{x}_1,\ldots,\bar{x}_m$. Again, as
$-2 \log A_{ik}$, from (2.14), is a distance measure between $\bar{x}_i$ and $a_k$, the posterior mean
in (3.2) will take more account of $\bar{x}_l$'s which are close to $\bar{x}_i$ rather than those which
are some distance away. We suggest that (3.2) will in many practical situations be
preferable to the James-Stein estimator, as far as meaningful statistical interpretations
are concerned, since it does not shrink all the $\bar{x}_i$ irrevocably towards a common value
without taking into account the statistical scatter of the data.
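Given fitted points and posterior weights, the empirical posterior mean of each parameter is simply the weight-averaged point. The short sketch below uses hypothetical weights for three observations and two prior points; the name `posterior_means` is illustrative.

```python
def posterior_means(P, a):
    # For each observation i, average the prior points a_k using the
    # posterior probabilities P[i][k] as weights.
    return [sum(p * ak for p, ak in zip(row, a)) for row in P]

a = [0.44, 0.15]                            # hypothetical prior points
P = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]    # hypothetical posterior weights
means = posterior_means(P, a)
```

An observation whose weight sits almost entirely on one point is pulled to that point, while an observation with divided weights lands between the points, which is how an estimate can stay slightly apart from its group.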


4.  NUMERICAL  EXAMPLES 

The data in Table 1 relate to the numbers of males and females on 10 different courses, and were
previously  analyzed  by  Leonard  (1972)  using  a  Bayes-Stein  estimation  technique  for  binomial 
data. 

Table 1. Classification of Students According to Sex and Course

Course   Female   Male   % of Females   Bayes-Stein   Empirical
   1       42      47        47.2           44.4         44.0
   2       32      40        44.4           41.6         44.0
   3       45      57        44.1           42.1         44.0
   4       10      16        38.5           34.5         43.2
   5        7      20        25.9           26.7         21.1
   6        3      12        20.0           24.1         18.2
   7        3      13        18.8           23.6         17.3
   8        5      22        18.5           22.3         15.7
   9       12      72        14.3           16.9         15.7
  10       11      84        11.6           14.5         15.3

The rows of this table were not originally arranged according to the values of the
percentages; the present ordering is intended simply for ease of presentation.

The  Bayes-Stein  estimates  in  the  fifth  column  shrink  each  observed  proportion  towards 
an  average  value  of  28.0.  The  amounts  of  shrinkage  vary  according  to  sample  size  and 
according  to  distance  from  the  average  value  when  measured  on  a  logistic  scale. 

Application of our empirical method in Section 2(b) yielded an estimated common prior
distribution  for  the  binomial  probabilities.  This  assigned  prior  probabilities  4/10  and 
6/10  to  the  values  0.440  and  0.153. 

We  see  from  the  last  column  of  Table  1  that  our  empirical  procedure  has  discerned  that 
the observed percentages lie in two clearly distinct groups. It has moreover decided that
the fourth percentage lies in the first group, and therefore pulls the 38.5 value right up
to 43.2, in the opposite direction to the radical shrinkage to 34.5 which was suggested
by  James-Stein.  The  first  three  percentages  are  regarded  as  equal  with  the  fourth 
percentage  just  a  small  distance  away. 

The  second  group  of  six  percentages  causes  shrinkages  for  the  first  five  which  are  all 
opposite  in  direction  to  that  suggested  by  Bayes-Stein.  Percentage  number  5  is  slightly 
unwilling  to  join  the  group,  because  of  possible  inclinations  to  either  join  the  first 
group  or  to  stay  on  its  own.  Overall  the  differences  from  James-Stein  are  quite 
remarkable. 

We  also  reanalyzed  the  famous  baseball  batting  example  introduced  by  Efron  and  Morris 
(1974). Again, the common prior distribution was estimated by a two-point discrete
distribution, but this time the two points were close enough together to retain Bayes-Stein
type shrinkages towards a common value. Interestingly, our posterior means were virtually
identical to the estimates proposed by Efron and Morris, even though the latter were based
upon very different (parametric) assumptions. Therefore our estimates seem to agree with
Bayes-Stein when the scatter of the data is well-enough behaved to justify these simple
shrinkages.
The data in Table 2 comprise a subset of a well-known 14 × 14 contingency table
introduced by Karl Pearson (1904). The entries in the fourth column give the proportions
of sons who follow their father's occupation, for each of fourteen occupations; the
categories have again been rearranged into a suitable order.

Table 2: Proportions of Sons Following Their Father's Occupation

Occupation (i)   x_i   n_i   Observed Proportion   Smoothed Proportion
       1           0    26          0.000                 0.020
       2           6    88          0.068                 0.103
       3          11   106          0.104                 0.103
       4           7    54          0.130                 0.115
       5           6    44          0.137                 0.127
       6           4    19          0.211                 0.221
       7          18    69          0.261                 0.257
       8           9    32          0.281                 0.270
       9           6    18          0.333                 0.334
      10          23    51          0.451                 0.477
      11          54   115          0.470                 0.480
      12          20    41          0.468                 0.480
      13          28    50          0.560                 0.480
      14          51    62          0.823                 0.823


In this case our empirical prior distribution assigned respective probabilities 1/14,
4/14, 4/14, 4/14 and 1/14 to the points 0.020, 0.103, 0.257, 0.480, and 0.823, representing
a number of interesting features in the scatter of the data. The corresponding posterior
means are described in the fifth column of the table.

The  first  two  groups  illustrate  that  our  method  can  be  used  to  decide  whether  or  not 
particular  observations  are  outliers.  The  second  proportion  (0.068)  has  been  pulled  back 
into  the  main  group,  whilst  the  first  proportion  (0.000)  has  been  left  virtually  alone. 
Similarly  the  14th  proportion  (0.823)  is  left  alone  by  the  fifth  group  whilst  the  ninth 
proportion is of interest as an internal outlier isolating itself between the third and
fifth  groups. 

Our method provides a type of cluster analysis since it groups the observations into
definite clusters. Also, the method seems to be robust under deviations from the
assumption of exchangeability of $\theta_1,\ldots,\theta_m$. If there is strong evidence in the cluster to
refute exchangeability for a particular parameter then the latter is simply estimated as an
outlier without radically affecting the other estimates. Indeed, our method effectively
splits the parameters up into exchangeable subsets, thus providing an alternative to the
Efron and Morris (1973b) procedure for deciding whether to combine possibly related
estimation problems. Finally, our method could be viewed as an alternative to standard
techniques for multiple comparisons since it smooths the data to a form where it is easy to
compare subsets of the parameters.

5.  RELATIONSHIP  WITH  NONPARAMETRIC  KERNEL  METHODS 

Suppose, for simplicity, that $f_i$ belongs to the symmetric location family

$$f_i(x_i|\theta_i) = f(|x_i - \theta_i|) \qquad (5.1)$$

Then our method estimates the marginal density

$$\xi(x) = \int_{\Theta} f(|x - \theta|)\, g(\theta)\, d\theta \qquad (5.2)$$

by

$$\hat{\xi}(x) = m^{-1} \sum_{i=1}^{m} f(|x - a_i|) \qquad (x \in \mathcal{X}) \qquad (5.3)$$

where the $a_i$ are calculated via our computational procedure.

We see that (5.3) could also be used as an estimate for the density $\xi(\cdot)$ under the
assumption that the sampling (rather than marginal) density of $x_1,\ldots,x_m$ is equal to
$\xi(x)$. There are close similarities with nonparametric kernel estimators of the form

$$\xi^*(x) = m^{-1} \sum_{i=1}^{m} f(|x - x_i|)$$

These are prevalent in the literature; see Silverman (1978) for some recent
developments. The estimate $\xi^*$ averages the kernels $f(|x - x_i|)$ centered on the data
points, rather than centered on $a_1,\ldots,a_m$, as in (5.3).
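The contrast between the two estimators can be sketched with a normal kernel: the same averaging function gives the usual kernel estimate when centred on the observations and the estimate (5.3) when centred on the fitted points. The data, the fitted points, and the scale `sigma` below are hypothetical values chosen only for illustration.

```python
import math

def normal_kernel(u, sigma):
    # Normal density with mean zero and standard deviation sigma at u.
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def averaged_kernels(x, centres, sigma):
    # Average of kernels f(|x - c|) over the supplied centres.
    return sum(normal_kernel(abs(x - c), sigma) for c in centres) / len(centres)

data = [1.0, 1.4, 0.8, 5.1, 4.7]        # hypothetical observations
points = [1.07, 1.07, 1.07, 4.9, 4.9]   # hypothetical fitted a_i
sigma = 0.5

kernel_estimate = averaged_kernels(1.0, data, sigma)      # centred on the data
smoothed_estimate = averaged_kernels(1.0, points, sigma)  # centred on the a_i
```

Because the fitted points are more compressed than the observations, the second estimate concentrates more mass near each cluster for the same value of `sigma`.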

Kernel estimators are open to criticism on the following grounds:

(i) They tend to lead to estimators which are too "flat". The variance corresponding
to $\xi^*(x)$ is theoretically always larger than the sample variance of the observations.

(ii) When an equal kernel is placed over each data point, then, according to its
spread, the estimator very often tends to be either too flat, or too bumpy in the details.

(iii) When, say, $f$ is a normal density with mean zero and variance $\sigma^2$, the value
$\sigma^2$ is referred to as the "band width" and regulates the degree of smoothing. It is
notoriously difficult to obtain a reasonable analytic method for estimating $\sigma^2$ from the
data.

Our procedure promises to answer all three criticisms. Firstly, as the $a_i$ are more
compressed than the $x_i$, the estimator $\hat{\xi}$ in (5.3) will always be less flat. Secondly, by
estimating the $a_i$ according to the scatter of the data, it will avoid many of the problems
in (ii). Thirdly, when $f$ is a normal (or other symmetric) density with scale parameter
$\sigma^2$, we may estimate $\sigma^2$ as well. In the normal case we may use equations (2.13)-(2.14)
with single replications $n_i = 1$, when the equations still possess enough structure to
sensibly estimate $\sigma^2$.

The kernel ideas will be pursued in greater detail elsewhere.